Parallel Performance Optimizations on Unstructured Mesh-based Simulations

Authors

  • Abhinav Sarje
  • Sukhyun Song
  • Douglas Jacobsen
  • Kevin A. Huck
  • Jeffrey K. Hollingsworth
  • Allen D. Malony
  • Samuel Williams
  • Leonid Oliker
Abstract

This paper addresses two key parallelization challenges in the unstructured mesh-based ocean modeling code MPAS-Ocean, which uses a mesh based on Voronoi tessellations: (1) load imbalance across processes, and (2) unstructured data access patterns that inhibit intra- and inter-node performance. Our work analyzes the load imbalance caused by naive partitioning of the mesh and develops methods to generate mesh partitionings with better load balance and reduced communication. Furthermore, we present methods that minimize both inter- and intra-node data movement and maximize data reuse. Our techniques include predictive ordering of data elements for higher cache efficiency, as well as communication reduction approaches. We present detailed performance data from runs on thousands of cores of the Cray XC30 supercomputer and show that our optimization strategies can improve on the original performance by over 2×. Additionally, many of these solutions can be broadly applied to a wide variety of unstructured grid-based computations.
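The two quantities the abstract trades off, load balance and communication, can be illustrated with a minimal sketch. This is not the authors' method or MPAS-Ocean code: the function names `load_imbalance` and `edge_cut`, the toy 2×4 grid, and both partitions are assumptions made purely for illustration. Imbalance is taken here as the maximum partition load divided by the mean load, and the number of mesh edges crossing partitions serves as a rough proxy for communication volume.

```python
def load_imbalance(part, num_parts):
    """Max partition load divided by mean load (1.0 means perfectly balanced).

    `part[i]` is the (hypothetical) process that owns cell i; every cell
    is assumed to cost one unit of work.
    """
    loads = [0] * num_parts
    for owner in part:
        loads[owner] += 1
    mean = sum(loads) / num_parts
    return max(loads) / mean


def edge_cut(edges, part):
    """Count edges whose endpoints live on different processes.

    Each cut edge implies data exchanged between processes, so this is a
    simple stand-in for communication cost.
    """
    return sum(1 for u, v in edges if part[u] != part[v])


# Toy mesh: a 2x4 grid of cells numbered
#   0 1 2 3
#   4 5 6 7
edges = [(0, 1), (1, 2), (2, 3), (4, 5), (5, 6), (6, 7),
         (0, 4), (1, 5), (2, 6), (3, 7)]

naive = [0, 0, 0, 0, 0, 0, 1, 1]      # six cells on process 0, two on process 1
balanced = [0, 0, 1, 1, 0, 0, 1, 1]   # four cells on each process

for name, part in (("naive", naive), ("balanced", balanced)):
    print(name, load_imbalance(part, 2), edge_cut(edges, part))
```

In this toy case the balanced partition has both a lower imbalance factor and fewer cut edges, but on real meshes the two goals can conflict, which is why partitioners typically treat them as a joint objective.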

Similar resources

Optimizing CAD and Mesh Generation Workflow for SeisSol

SeisSol is a simulation software for seismic wave propagation and earthquake scenarios. It solves the fully elastic wave equations in heterogeneous media. Incorporating dynamic rupture simulation, it performs complex multiphysics earthquake simulations. To account for complicated geometries, SeisSol uses a fully unstructured tetrahedral mesh. Recent publications [1], [2] have shown that SeisSol i...

JSweep: A Patch-centric Data-driven Approach for Parallel Sweeps on Large-scale Meshes

In mesh-based numerical simulations, sweep is an important computation pattern. During sweeping a mesh, computations on cells are strictly ordered by data dependencies in given directions. Due to such a serial order, parallelizing sweep is challenging, especially for unstructured and deforming structured meshes. Meanwhile, recent high-fidelity multi-physics simulations of particle transport, in...

Performance Analysis and Optimization of the OP2 Framework on Many-Core Architectures

This paper presents a benchmarking, performance analysis and optimization study of the OP2 ‘active’ library, which provides an abstraction framework for the parallel execution of unstructured mesh applications. OP2 aims to decouple the scientific specification of the application from its parallel implementation, and thereby achieve code longevity and near-optimal performance through re-targetin...

Compiler Optimizations for Industrial Unstructured Mesh CFD Applications on GPUs

Graphical Processing Units (GPUs) have shown acceleration factors over multicores for structured mesh-based Computational Fluid Dynamics (CFD). However, the value remains unclear for dynamic and irregular applications. Our motivating example is HYDRA, an unstructured mesh application used in production at Rolls-Royce for the simulation of turbomachinery components of jet engines. We describe th...

Sustained Petascale Performance of Seismic Simulations with SeisSol on SuperMUC

Seismic simulations in realistic 3D Earth models require peta- or even exascale computing power to capture small-scale features of high relevance for scientific and industrial applications. In this paper, we present optimizations of SeisSol – a seismic wave propagation solver based on the Arbitrary high-order accurate DERivative Discontinuous Galerkin (ADER-DG) method on fully adaptive, unstructu...

Publication date: 2015